Participant observation of griefing in a journey through the World of Warcraft
Through the ethnographic method of participant observation in World of Warcraft, this paper aims to document various actions that may be considered griefing among the Massively Multiplayer Online Role-Playing Game community. Griefing as a term can be very subjective, so witnessing the anti-social and intentional actions first-hand can be used as a means to understand this subjectivity among players as well as to produce a thorough account of some of the toxic behavior in this genre. The participant observation was conducted across several years and expansions of World of Warcraft, and the author became familiar with many griefing-related actions, although some of these were perceived as acceptable game-play elements
The impact of environmental stochasticity on value-based multiobjective reinforcement learning
A common approach to addressing multiobjective problems using reinforcement learning methods is to extend model-free, value-based algorithms such as Q-learning to use a vector of Q-values in combination with an appropriate action selection mechanism that is often based on scalarisation. Most prior empirical evaluation of these approaches has focused on deterministic environments. This study examines the impact of stochasticity in rewards and state transitions on the behaviour of multi-objective Q-learning. It shows that the nature of the optimal solution depends on these environmental characteristics, and also on whether we desire to maximise the Expected Scalarised Return (ESR) or the Scalarised Expected Return (SER). We also identify a novel aim which may arise in some applications, that of maximising SER subject to satisfying constraints on the variation in return, and show that this may require different solutions than ESR or conventional SER. The analysis of the interaction between environmental stochasticity and multi-objective Q-learning is supported by empirical evaluations on several simple multiobjective Markov Decision Processes (MOMDPs) with varying characteristics. This includes a demonstration of a novel approach to learning deterministic SER-optimal policies for environments with stochastic rewards. In addition, we report a previously unidentified issue with model-free, value-based approaches to multiobjective reinforcement learning (MORL) in the context of environments with stochastic state transitions. Having highlighted the limitations of value-based model-free MORL methods, we discuss several alternative methods that may be more suitable for maximising SER in MOMDPs with stochastic transitions. © 2021, The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature
Softmax exploration strategies for multiobjective reinforcement learning
Despite growing interest over recent years in applying reinforcement learning to multiobjective problems, there has been little research into the applicability and effectiveness of exploration strategies within the multiobjective context. This work considers several widely-used approaches to exploration from the single-objective reinforcement learning literature, and examines their incorporation into multiobjective Q-learning. In particular this paper proposes two novel approaches which extend the softmax operator to work with vector-valued rewards. The performance of these exploration strategies is evaluated across a set of benchmark environments. Issues arising from the multiobjective formulation of these benchmarks which impact on the performance of the exploration strategies are identified. It is shown that of the techniques considered, the combination of the novel softmax–epsilon exploration with optimistic initialisation provides the most effective trade-off between exploration and exploitation
Intent-aligned AI systems deplete human agency: the need for agency foundations research in AI safety
The rapid advancement of artificial intelligence (AI) systems suggests that artificial general intelligence (AGI) systems may soon arrive. Many researchers are concerned that AIs and AGIs will harm humans via intentional misuse (AI-misuse) or through accidents (AI-accidents). In respect of AI-accidents, there is an increasing effort focused on developing algorithms and paradigms that ensure AI systems are aligned to what humans intend, e.g. AI systems that yield actions or recommendations that humans might judge as consistent with their intentions and goals. Here we argue that alignment to human intent is insufficient for safe AI systems and that preservation of long-term agency of humans may be a more robust standard, and one that needs to be separated explicitly and a priori during optimization. We argue that AI systems can reshape human intention and discuss the lack of biological and psychological mechanisms that protect humans from loss of agency. We provide the first formal definition of agency-preserving AI-human interactions which focuses on forward-looking agency evaluations and argue that AI systems - not humans - must be increasingly tasked with making these evaluations. We show how agency loss can occur in simple environments containing embedded agents that use temporal-difference learning to make action recommendations. Finally, we propose a new area of research called "agency foundations" and pose four initial topics designed to improve our understanding of agency in AI-human interactions: benevolent game theory, algorithmic foundations of human rights, mechanistic interpretability of agency representation in neural-networks and reinforcement learning from internal states
Function similarity using family context
Finding changed and similar functions between a pair of binaries is an important problem in malware attribution and for the identification of new malware capabilities. This paper presents a new technique called Function Similarity using Family Context (FSFC) for this problem. FSFC trains a Support Vector Machine (SVM) model using pairs of similar functions from two program variants. This method improves upon previous research called Cross Version Contextual Function Similarity (CVCFS) by representing a function using features extracted not just from the function itself, but also from other functions with which it has a caller and callee relationship. We present the results of an initial experiment that shows that the use of additional features from the context of a function significantly decreases the false positive rate, obviating the need for a separate pass for cleaning false positives. The more surprising finding is that the SVM model produced by FSFC can abstract function similarity features from one pair of program variants to find similar functions in an unrelated pair of program variants. If validated by a larger study, this new property leads to the possibility of creating generic similar function classifiers that can be packaged and distributed in reverse engineering tools such as IDA Pro and Ghidra. This research was performed in the Internet Commerce Security Lab (ICSL), which is a joint venture with research partners Westpac, IBM, and Federation University Australia
Language representations for generalization in reinforcement learning
The choice of state and action representation in Reinforcement Learning (RL) has a significant effect on agent performance for the training task, but its relationship with generalization to new tasks is under-explored. One approach to improving generalization investigated here is the use of language as a representation. We compare vector states and discrete actions to language representations. We find that agents using language representations generalize better and can solve tasks with more entities, new entities, and more complexity than seen in the training task. We attribute this to the compositionality of language
Hybrid intrusion detection system based on the stacking ensemble of C5 decision tree classifier and one class support vector machine
Cyberattacks are becoming increasingly sophisticated, necessitating efficient intrusion detection mechanisms to monitor computer resources and generate reports on anomalous or suspicious activities. Many Intrusion Detection Systems (IDSs) use a single classifier for identifying intrusions. Single-classifier IDSs are unable to achieve high accuracy and low false alarm rates due to polymorphic, metamorphic, and zero-day behaviors of malware. In this paper, a Hybrid IDS (HIDS) is proposed by combining the C5 decision tree classifier and One Class Support Vector Machine (OC-SVM). HIDS combines the strengths of the Signature Intrusion Detection System (SIDS) and the Anomaly-based Intrusion Detection System (AIDS). The SIDS was developed based on the C5.0 decision tree classifier and the AIDS was developed based on the one-class Support Vector Machine (SVM). This framework aims to identify both well-known intrusions and zero-day attacks with high detection accuracy and low false-alarm rates. The proposed HIDS is evaluated using the benchmark datasets, namely, the Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD) and Australian Defence Force Academy (ADFA) datasets. Studies show that the performance of HIDS is enhanced, compared to SIDS and AIDS, in terms of detection rate and low false-alarm rates. © 2020 by the authors. Licensee MDPI, Basel, Switzerland
A novel ensemble of hybrid intrusion detection system for detecting internet of things attacks
The Internet of Things (IoT) has been rapidly evolving towards making a greater impact on everything from everyday life to large industrial systems. Unfortunately, this has attracted the attention of cybercriminals who have made IoT a target of malicious activities, opening the door to possible attacks on the end nodes. Due to the large number and diverse types of IoT devices, it is a challenging task to protect the IoT infrastructure using a traditional intrusion detection system. To protect IoT devices, a novel ensemble Hybrid Intrusion Detection System (HIDS) is proposed by combining a C5 classifier and a One Class Support Vector Machine classifier. HIDS combines the advantages of the Signature Intrusion Detection System (SIDS) and the Anomaly-based Intrusion Detection System (AIDS). The aim of this framework is to detect both well-known intrusions and zero-day attacks with high detection accuracy and low false-alarm rates. The proposed HIDS is evaluated using the Bot-IoT dataset, which includes legitimate IoT network traffic and several types of attacks. Experiments show that the proposed hybrid IDS provides a higher detection rate and a lower false positive rate compared to the SIDS and AIDS techniques. © 2019 by the authors. Licensee MDPI, Basel, Switzerland